Abstract

We have carefully instrumented a large portion of the population living in a
university graduate dormitory by giving participants Android smart phones running
our sensing software. In this paper, we propose the novel problem of predicting
mobile application (known as “apps”) installation using social networks and
explain its challenges. Modern smart phones, like the ones used in our study, are
able to collect different social networks using built-in sensors (e.g. the
Bluetooth proximity network, the call log network, etc.). While this information
is accessible to app market makers such as the iPhone AppStore, it has not yet
been studied how app market makers can use this information for marketing
research and strategy development. We develop a simple computational model to
better predict app installation by using a composite network computed from the
different networks sensed by phones. Our model also captures individual variance
and exogenous factors in app adoption. We show the importance of considering all
these factors in predicting app installations, and we observe the surprising
result that app installation is indeed predictable. We also show that our model
achieves the best results compared with generic approaches: our prediction
results are four times better than random guess, and reach almost 45% prediction
precision with 45% recall.

Introduction 

Recent research projects have demonstrated that social networks correlate with
individual behaviors, such as obesity (Christakis and Fowler 2007) and diseases
(Colizza et al. 2007), to name two. Many large-scale networks have been analyzed,
and this field is becoming increasingly popular (Eagle, Macy, and Claxton 2010)
(Leskovec, Adamic, and


We are interested in studying network-based prediction for the installation of
mobile applications (referred to as “apps”), as the mobile application business
is growing rapidly (Ellison 2010). The app market makers, such as the iPhone
AppStore and the Android Market, run on almost all modern smart phones, and they
have access to phone data and sensor data. As a result, app market makers can
infer different types of networks, such as the call log network and the Bluetooth
proximity network, from phone data. However, it remains an unknown yet important
question whether these data can be used for app marketing. Therefore, in this
paper we address the challenge of utilizing all the different network data
obtained from smart phones for app installation prediction.

It is natural to speculate that there are network effects in users’ app
installation, but we eventually realized that it is very difficult to adopt
existing tools from large-scale social network research to model and predict the
installation of certain mobile apps for each user, due to the following facts:

1. The underlying network is not observable. While many projects assume phone
call logs are true social/friendship networks (Zhang and Dantu 2010), others
may use whatever network is available as the underlying social network.
Researchers have discovered that the call network may not be a good
approximation (Eagle and Pentland 2006). On the other hand, smart phones can
easily sense multiple networks using built-in sensors and software: a) The call
logs can be used to form phone call networks; b) Bluetooth radio can be used to
infer proximity networks (Eagle and Pentland 2006); c) GPS data can be used to
infer user moving patterns, and furthermore their working places and
affiliations (Farrahi and Gatica-Perez 2010); d) Social network tools (such as
the Facebook app and the Twitter app) can observe users’ online friendship
network. In this work, our key idea is to infer an optimal composite network,
the network that best describes app installation, from multiple layers of
different networks easily observed by modern smart phones, rather than assuming
a certain network as the real social network explaining app installation.

epidemics (Ganesh, Massoulie, and Towsley 2005) and Twitter networks (Yang and
Leskovec 2010) is based on the fact that the network is the only mechanism for
adoption. The only way to get the flu is to catch the flu from someone else, and
the only way to retweet is to see the tweet message from someone else. For
mobile apps, however, this is not true at all. Any user can simply open the
AppStore (on iPhones) or the Android Market (on Android phones), browse over
different lists of apps, and pick the one that appears most interesting to
install, without peer influence. One big challenge, which makes modeling the
spreading of apps difficult, is that one can install an app without any external
influence or information. One major contribution of this paper is that we
demonstrate it is still possible to build a tool to observe network effects with
such randomness.

3. The individual behavioral variance in app installation is so significant that
any network effect might be rendered unobservable from the data. For instance,
some enthusiast users may try and install all hot apps on the market, while many
inexperienced users find it troublesome even to go through the process of
installing an app, and as a result they install very few apps.

4. There are exogenous factors in app installation behaviors. One particular
factor is the popularity of apps. For instance, the Pandora Radio app is vastly
popular and highly ranked in the app store, while most other apps are not. Our
model takes this issue into account too, and we show that exogenous factors are
important in increasing prediction precision.

Classic diffusion models such as Granovetter’s work (Granovetter and Soong 1983)
are applicable to simulation, but lack data fitting and prediction powers.
Statistical analyses used by social scientists, such as matched sample estimation
(Aral, Muchnik, and Sundararajan 2009), are only for identifying network effects
and mechanisms. Recent works in computer science for inferring network structure
assume a simple diffusion mechanism, and are only applicable to artificial
simulation data on real networks (Gomez Rodriguez, Leskovec, and Krause 2010)
(Myers


On the other hand, our work addresses the above issues in practical app marketing
prediction. On the mobile-based behavioral prediction side, the closest research
is the churn prediction problem in mobile networks (Richter, Yom-Tov, and Slonim
2010), which uses call logs to predict users’ future decisions of switching
mobile providers. To our knowledge, there is no other related work on similar
problems.

Data 

We collected our data from March to July 2010 with 55 participants, who are
residents living in a married graduate student residence of a major US
university. Each participant was given an Android-based cell phone with built-in
sensing software developed by us. The software runs in a passive manner, and it
does not interfere with the normal usage of the phone.

Our software is able to capture all call logs in the experiment period. We
therefore obtained a call log network between all participants by treating
participants as nodes and the number of calls between two nodes as the weight of
the edge in between. The software also scans nearby phones and other Bluetooth
devices every five minutes to capture the proximity network between individuals.
The counts of Bluetooth hits are used as edge weights, similar to the call log
network, as done in Eagle et al. (Eagle and Pentland 2006). We have also
collected the affiliation network and the friendship network by deploying a
survey, which lists all the participants and asks each one to list their
affiliations (i.e. the academic department) and rate their relationships with
everyone else in the study. We believe that for app market makers the
affiliation network can also be inferred simply by using phone GPS/cell tower
information, as shown by Farrahi et al. (Farrahi and Gatica-Perez 2010). However,
this is not the focus of this work, and here we simply use survey data instead.
Though the friendship network is also collected using surveys, we suggest that
app market makers can obtain the friendship network from phones by collecting
data from social networking apps such as the Facebook and Twitter apps. We
summarize all the networks obtained from both phones and surveys in Table 1. We
refer to all networks in Table 1 as candidate networks, and all candidate
networks will be used to compute the optimal composite network. It should be
noted that all networks are reciprocal in this work.
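The construction of the weighted networks described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `build_weighted_network`, the `(i, j)` event format and the toy call list are hypothetical and are not taken from the study's software.

```python
from collections import Counter


def build_weighted_network(events, num_users):
    """Return a symmetric num_users x num_users weight matrix.

    events: iterable of (i, j) pairs, one per observed call or Bluetooth
    scan hit between participants i and j (0-based, hypothetical format).
    """
    counts = Counter()
    for i, j in events:
        if i == j:
            continue  # ignore self-loops
        counts[(min(i, j), max(i, j))] += 1  # direction is discarded
    matrix = [[0] * num_users for _ in range(num_users)]
    for (i, j), n in counts.items():
        matrix[i][j] = n
        matrix[j][i] = n  # every network is treated as reciprocal
    return matrix


# Toy call log: two calls between users 0 and 1, one between 0 and 2.
calls = [(0, 1), (1, 0), (0, 2), (2, 2)]
call_network = build_weighted_network(calls, 3)
```

The same helper would apply to Bluetooth scan hits; the binary survey networks need no counting and can be stored directly as 0/1 matrices.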

We want to emphasize that the network data listed in Table 1 are obtainable by
app market makers such as the Apple iTunes Store, as they have access to phone
sensors as well as user accounts. Therefore, our approach in this paper can be
beneficial to them for marketing research, customized app recommendation and
marketing strategy making.

Our built-in sensing platform constantly monitors the installation of mobile
apps. Every time a new app is installed, this information is collected and sent
back to our server within a day. Overall, we received a total of 821 apps
installed by all 55 users. Among them, 173 apps have at least two users. For
this analysis, we only look at app installations and ignore un-installations. We
first present summary statistics for the apps in the study: In Fig. 1(a) we plot
the distribution of the number of users installing each app. We discover that our
data correspond very well with a power-law distribution with exponential cutoff.
In Fig. 1(b) we plot the distribution of the number of apps installed per user,
which fits well with an exponential distribution.

Fig. 1(a) and 1(b) give detailed insight into our dataset. Even with a small
number of participants, the distribution characteristics are clearly observable.
We find that apps have a power-law distribution of users, which suggests that
most apps in our study community have a very small user pool, and very few apps
have spread broadly. The exponential decay in Fig. 1(b) suggests that the
variance among individual users is significant: there are users having more than
100 apps installed, and there are users having only a couple of apps.

Model 

In this section, we describe our novel model for capturing app installation
behaviors in networks. In the following content, $\mathbf{G}$ denotes the
adjacency matrix for graph $G$. Each user is denoted by $u \in \{1, \ldots, U\}$.
Each app is denoted by $a \in \{1, \ldots, A\}$. We define the binary random
variable $x_u^a$ to represent the status of adoption (i.e. app installation):
$x_u^a = 1$ if $a$ is adopted by user $u$, 0 if not.

As introduced in the previous section, the different social relationship networks
that can be inferred by phones are denoted by $G^1, \ldots, G^M$. Our model aims
at inferring an optimal composite network $G^{opt}$ with the most predictive
power from all the candidate social networks. The weight of edge $e_{i,j}$ in
graph $G^m$ is denoted by $w^m_{i,j}$. The weight of an edge in $G^{opt}$ is
simply denoted by $w_{i,j}$.

Network                       Type                   Source                                      Notation
Call Log Network              Undirected, Weighted   # of Calls                                  $G^c$
Bluetooth Proximity Network   Undirected, Weighted   # of Bluetooth Scan Hits                    $G^b$
Friendship Network            Undirected, Binary     Survey Results (1: friend; 0: not friend)   $G^f$
Affiliation Network           Undirected, Binary     Survey Results (1: same; 0: different)      $G^a$

Table 1: Network data used in this study.

Figure 1: Circles are real data, and lines are fitting curves. Left (a): Distribution of
number of users for each app. Right (b): Distribution of number of apps each user installed.

Adoption  Mechanism 

One basic idea of our model is the non-negative accumulative assumption, which
distinguishes our model from other linear mixture models. We define $G^{opt}$ to
be:

$$G^{opt} = \sum_m \alpha_m G^m, \quad \text{where } \forall m, \ \alpha_m \ge 0. \tag{1}$$

The intuition behind this non-negative accumulative assumption is as follows: if
two nodes are connected by a certain type of network, their app installation
behaviors may or may not correlate with each other. On the other hand, if two
nodes are not connected by a certain type of network, the absence of the link
between them should have neither a positive nor a negative effect on the
correlation between their app installations. As shown in Table 2 in the
experiment section, our non-negative assumption brings a significant performance
increase in prediction. The non-negative assumption also makes the model
stochastic and theoretically sound. We treat binary graphs as weighted graphs as
well. Since $\alpha_1, \ldots, \alpha_M$ are the non-negative weights for each
candidate network in describing the optimal composite network, we later refer to
the vector $(\alpha_1, \ldots, \alpha_M)$ as the optimal composite vector. Our
non-negative accumulative formulation is also similar to mixture matrix models in
the machine learning literature (El-Yaniv, Pechyony, and Yom-Tov 2008).
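As a small illustration of Eq. 1, the sketch below combines candidate adjacency matrices with non-negative weights. The function name `composite_network` and the toy matrices and weights are assumptions made for the example; in the paper the weights are learned, not chosen by hand.

```python
def composite_network(candidates, alphas):
    """Combine candidate adjacency matrices as in Eq. 1: sum_m alpha_m * G^m."""
    if len(candidates) != len(alphas):
        raise ValueError("need one weight per candidate network")
    if any(a < 0 for a in alphas):
        raise ValueError("composite weights must be non-negative")
    n = len(candidates[0])
    combined = [[0.0] * n for _ in range(n)]
    for graph, alpha in zip(candidates, alphas):
        for i in range(n):
            for j in range(n):
                combined[i][j] += alpha * graph[i][j]
    return combined


# Toy candidate networks (hypothetical values) for three users.
call_net = [[0, 2, 0], [2, 0, 0], [0, 0, 0]]    # weighted
friend_net = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]  # binary, treated as weighted
g_opt = composite_network([call_net, friend_net], [0.5, 2.0])
```

Because every weight is non-negative, a missing edge in one candidate network can never cancel an edge contributed by another, which is the intuition stated above.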

We continue to define the network potential $p_a(i)$:

$$p_a(i) = \sum_{j \in \mathcal{N}(i)} w_{i,j} \, x_j^a, \tag{2}$$

where the neighborhood of node $i$ is defined by:

$$\mathcal{N}(i) = \{ j \mid \exists m \ \text{s.t.} \ w^m_{i,j} > 0 \}. \tag{3}$$

The potential $p_a(i)$ can also be decomposed into potentials from different
networks:

$$p_a(i) = \sum_m \alpha_m \underbrace{\Big( \sum_{j \in \mathcal{N}(i)} w^m_{i,j} \, x_j^a \Big)}_{p_a^m(i)}, \tag{4}$$

where $p_a^m(i)$ is the potential computed from one single candidate network. We
can think of $p_a(i)$ as the potential of $i$ installing app $a$ based on the
observations of its neighbors on the composite network. The definition here is
also similar to incoming influence from adopted peers in many cascade models
(Kempe, Kleinberg, and Tardos 2003).
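The potentials in Eq. 2 and Eq. 4 can be sketched as below. The function names, the toy networks and the weights are hypothetical; the example only checks that summing per-network potentials with weights $\alpha_m$ gives the composite potential.

```python
def network_potential(weights, adopted, i):
    """Eq. 2 on one network: sum of w[i][j] * x_j over the other nodes j.

    Nodes with zero weight to i add nothing, so the neighbor set of Eq. 3
    never has to be built explicitly.
    """
    return sum(w * x for j, (w, x) in enumerate(zip(weights[i], adopted)) if j != i)


def composite_potential(candidates, alphas, adopted, i):
    """Eq. 4: sum over candidate networks of alpha_m * p_a^m(i)."""
    return sum(alpha * network_potential(graph, adopted, i)
               for graph, alpha in zip(candidates, alphas))


# Toy data (hypothetical): three users, two candidate networks, one app.
call_net = [[0, 2, 0], [2, 0, 0], [0, 0, 0]]
friend_net = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
adopted = [0, 1, 1]  # x_j^a: users 1 and 2 installed the app

p_call = network_potential(call_net, adopted, 0)      # 2 * 1 + 0 * 1
p_friend = network_potential(friend_net, adopted, 0)  # 1 * 1 + 1 * 1
p_total = composite_potential([call_net, friend_net], [0.5, 2.0], adopted, 0)
```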

Finally, our conditional probability is defined as:

$$\mathrm{Prob}(x_u^a = 1 \mid x_{u'}^a : u' \in \mathcal{N}(u)) = 1 - \exp(-s_u - p_a(u)), \tag{5}$$

where $\forall u, \ s_u \ge 0$. $s_u$ captures the individual susceptibility to
apps, regardless of which app. We use the exponential function for two reasons:

1. The monotonic and concave properties of $f(x) = 1 - \exp(-x)$ match recent
research (Centola 2010), which suggests that the probability of adoption
increases at a decreasing rate with increasing external network signals.

2. It forms a concave optimization problem during maximum likelihood estimation
in model training.

As shown in the experiment section and based on our experience, this exponential
model yields the best performance.
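A minimal sketch of Eq. 5 follows, together with a numerical check of the first reason given above: the probability rises with the potential, but by smaller and smaller amounts. The function name and the sample values of $s_u$ and $p_a(u)$ are illustrative assumptions.

```python
import math


def adoption_probability(s_u, potential):
    """Eq. 5: 1 - exp(-(s_u + p_a(u))), with s_u >= 0 and p_a(u) >= 0."""
    if s_u < 0 or potential < 0:
        raise ValueError("susceptibility and potential must be non-negative")
    return 1.0 - math.exp(-(s_u + potential))


# Probability for a fixed susceptibility and a growing network potential.
probs = [adoption_probability(0.1, p) for p in (0.0, 1.0, 2.0, 3.0)]
# Increment gained by each additional unit of potential.
gains = [later - earlier for earlier, later in zip(probs, probs[1:])]
```

Each entry of `gains` is positive and smaller than the one before it, which is the diminishing-returns shape the text attributes to the exponential form.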


Model Training

We move on to discuss model training. During the training phase, we want to
estimate the optimal values for $\alpha_1, \ldots, \alpha_M$ and
$s_1, \ldots, s_U$. We formalize it as an optimization problem by maximizing the
sum of all conditional log-likelihoods.

Given all candidate networks, a training set composed of a subset of apps
$\mathrm{TRAIN} \subset \{1, \ldots, A\}$, and
$\{x_u^a : \forall a \in \mathrm{TRAIN}, u \in \{1, \ldots, U\}\}$, we compute:

$$\arg\max f(s_1, \ldots, s_U, \alpha_1, \ldots, \alpha_M), \quad \text{subject to: } \forall u, \ s_u \ge 0, \ \forall m, \ \alpha_m \ge 0 \tag{6}$$

where:

$$f(s_1, \ldots, s_U, \alpha_1, \ldots, \alpha_M) = \log \Bigg[ \prod_{a \in \mathrm{TRAIN}} \Bigg( \prod_{u : x_u^a = 1} \mathrm{Prob}(x_u^a = 1 \mid x_{u'}^a : u' \in \mathcal{N}(u)) \prod_{u : x_u^a = 0} \big( 1 - \mathrm{Prob}(x_u^a = 1 \mid x_{u'}^a : u' \in \mathcal{N}(u)) \big) \Bigg) \Bigg] \tag{7}$$

$$= \sum_{a \in \mathrm{TRAIN}} \Bigg[ \sum_{u : x_u^a = 1} \log\big(1 - \exp(-s_u - p_a(u))\big) - \sum_{u : x_u^a = 0} \big(s_u + p_a(u)\big) \Bigg] \tag{8}$$

This is a concave optimization problem. Therefore, a global optimum is
guaranteed, and there exist efficient algorithms scalable to larger datasets
(Boyd and Vandenberghe 2004). We use a MATLAB built-in implementation here, and
it usually takes a few seconds during optimization in our experiments.

Compared with works on inferring networks (Gomez Rodriguez, Leskovec, and Krause
2010) (Myers and Leskovec 2010), our work is different as we compute $G^{opt}$
from existing candidate networks. In addition, we don’t need any additional
regularization term or tuning parameters in the optimization process.

We emphasize that our algorithm doesn’t address the causality problem (Aral,
Muchnik, and Sundararajan 2009) in network effects: i.e., we don’t attempt to
understand the different reasons why network neighbors have similar app
installation behaviors. It can either be diffusion (i.e. my neighbor tells me),
or homophily (i.e. network neighbors share the same interests and personality).
Instead, our focus is on prediction of app installation, and we leave the
causality problem as future work.

Virtual Network for Exogenous Factors

Obvious exogenous factors include the popularity and quality of an app. The
popularity and quality of an app will affect the ranking and review of the app
in the AppStore/AppMarket, and as a result lead to a higher or lower likelihood
of adoption. We can model this by introducing a virtual graph $G^p$, which can be
easily plugged into our composite network framework. $G^p$ is constructed by
adding a virtual node $U+1$ and one edge $e_{U+1,u}$ for each actual user $u$.
The corresponding weight of each edge $w_{U+1,u}$ for computing $p_a(u)$ is
$C^a$, where $C^a$ is a positive number describing the popularity of an app. In
our experiment, we use the number of installations of the app in this
experimental community as $C^a$. We have been looking at other sources to obtain
reliable estimates for $C^a$, but we found the granularity of public sources to
be unsatisfying. In practice, for app market makers, we argue that $C^a$ can be
easily and accurately obtained by counting app downloads and app ranks.

The exogenous factors also increase accuracy in measuring network effects for a
non-trivial reason: consider a network of two nodes connected by one edge, where
both nodes installed an app. If this app is very popular, then the fact that both
nodes have this app may not imply a strong network effect. On the contrary, if
this app is very uncommon, the fact that both nodes have this app implies a
strong network effect. Therefore, introducing exogenous factors does help our
algorithm better calibrate network weights.

Experiments

Our algorithm predicts the probability of adoption (i.e. installing an app)
given its neighbors’ adoption status. $p_i \in [0, 1]$ denotes the predicted
probability of installation, while $x_i \in \{0, 1\}$ denotes the actual outcome.
The most common prediction measure is the Root Mean Square Error
($\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (p_i - x_i)^2}$). This measure
is known to assess a prediction method’s ability poorly (Goel et al. 2010).
Since in our dataset most users have installed very few apps, a baseline
approach can simply predict the same small $p_i$ and still achieve very low RMSE.

For app marketing, the key objective is not to know the probability prediction
for each app installation, but to rank and identify a sub-group of individuals
who are more likely to appreciate and install certain apps compared with others.
Therefore, we mainly adopt the approach in rank-aware measures from information
retrieval practices (Manning et al. 2008). For each app, we rank the likelihood
of adoption computed by prediction algorithms, and study the following factors:

a) Mean Precision at k (MP-k): We select the top $k$ individuals with the
highest likelihood of adoption as predicted adopters from our algorithms, and
compute precision at $k$
($\frac{\#\ \text{true adopters among the } k \text{ predicted adopters}}{k}$).
We average precisions at $k$ among all apps in the testing set to get MP-k. On
average each app has five users in our dataset. Therefore, the default value for
$k$ is five in the following text. MP-k measures an algorithm’s performance on
predicting the most likely nodes.

b) Optimal $F_1$-score (referred to later simply as $F_1$ Score). The optimal
$F_1$-score is computed by computing $F_1$-scores
($\frac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}}$)
for each point on the Precision-Recall curve and selecting the largest $F_1$
value. Unlike MP-k, the optimal $F_1$ score is used to measure the overall
prediction performance of our algorithms. For instance, $F_1 = 0.5$ suggests the
algorithm can reach 50% precision at 50% recall.
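The two rank-aware measures can be sketched as follows. This is an illustrative implementation under assumptions: the function names are hypothetical, ties in the scores are broken by list order, and the Precision-Recall curve is sampled at every rank cutoff.

```python
def precision_at_k(scores, actual, k):
    """Fraction of the k highest-scored users who actually installed the app."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sum(actual[i] for i in ranked[:k]) / k


def mean_precision_at_k(score_rows, actual_rows, k):
    """MP-k: precision at k averaged over all apps in the test set."""
    values = [precision_at_k(s, x, k) for s, x in zip(score_rows, actual_rows)]
    return sum(values) / len(values)


def optimal_f1(scores, actual):
    """Largest F1 over all rank cutoffs (points on the Precision-Recall curve)."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    total_positive = sum(actual)
    best, hits = 0.0, 0
    for n, i in enumerate(ranked, start=1):
        hits += actual[i]
        if hits == 0:
            continue  # precision and recall are both zero here
        precision = hits / n
        recall = hits / total_positive
        best = max(best, 2 * precision * recall / (precision + recall))
    return best


# Toy predictions for two apps and five users (hypothetical values).
score_rows = [[0.9, 0.1, 0.8, 0.4, 0.3], [0.2, 0.7, 0.1, 0.6, 0.5]]
actual_rows = [[1, 0, 0, 1, 0], [0, 1, 0, 1, 0]]

mp2 = mean_precision_at_k(score_rows, actual_rows, 2)
f1_first_app = optimal_f1(score_rows[0], actual_rows[0])
```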

Prediction  using  Composite  Network 

To begin with, we illustrate different design aspects of our algorithm.

To demonstrate the importance of modeling both networks and individual variances
in our model, we here demonstrate the prediction performance with five
configurations using 5-fold cross-validation: a) to model both individual
variance and network effects; b) to model both individual variance and network
effects, but exclude the virtual network $G^p$ capturing exogenous factors; c) to
model with only individual variance (by forcing $\alpha_m = 0$ in Eq. 6); d) to
model with only network effects (by forcing $s_u = 0, \forall u$); and e) to
model with network effects while allowing the composite vector to be negative.
The results are illustrated in Table 2.
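The configurations above differ only in which parameters of the likelihood in Eq. 8 are free. The sketch below fits that likelihood with projected gradient ascent and a backtracking step size. This is a stand-in chosen for the example, not the MATLAB solver used in the paper; the data layout (`potentials[a][u][m]` for $p_a^m(u)$), the flags `use_individual` and `use_network`, the starting values and the toy data are all assumptions.

```python
import math

EPS = 1e-9  # keeps log(1 - exp(-z)) finite when z is close to zero


def log_likelihood(s, alphas, potentials, adopted):
    """Eq. 8, with potentials[a][u][m] = p_a^m(u) and adopted[a][u] = x_u^a."""
    total = 0.0
    for pot_a, x_a in zip(potentials, adopted):
        for u, x in enumerate(x_a):
            z = max(s[u] + sum(al * p for al, p in zip(alphas, pot_a[u])), EPS)
            total += math.log(-math.expm1(-z)) if x else -z
    return total


def gradient(s, alphas, potentials, adopted):
    """Partial derivatives of Eq. 8 with respect to each s_u and alpha_m."""
    grad_s, grad_a = [0.0] * len(s), [0.0] * len(alphas)
    for pot_a, x_a in zip(potentials, adopted):
        for u, x in enumerate(x_a):
            z = max(s[u] + sum(al * p for al, p in zip(alphas, pot_a[u])), EPS)
            dz = 1.0 / math.expm1(z) if x else -1.0  # derivative w.r.t. z
            grad_s[u] += dz
            for m, p in enumerate(pot_a[u]):
                grad_a[m] += dz * p
    return grad_s, grad_a


def fit(potentials, adopted, num_users, num_networks,
        use_individual=True, use_network=True, iterations=200):
    """Projected gradient ascent; disabled parameter groups stay at zero."""
    s = [0.1 if use_individual else 0.0] * num_users
    alphas = [0.1 if use_network else 0.0] * num_networks
    current = log_likelihood(s, alphas, potentials, adopted)
    for _ in range(iterations):
        grad_s, grad_a = gradient(s, alphas, potentials, adopted)
        step = 1.0
        while step > 1e-12:
            new_s = [max(0.0, v + step * g) if use_individual else 0.0
                     for v, g in zip(s, grad_s)]
            new_a = [max(0.0, v + step * g) if use_network else 0.0
                     for v, g in zip(alphas, grad_a)]
            candidate = log_likelihood(new_s, new_a, potentials, adopted)
            if candidate > current:  # accept only improving steps
                s, alphas, current = new_s, new_a, candidate
                break
            step /= 2
        else:
            break  # no improving step found, stop early
    return s, alphas, current


# Toy data (hypothetical): 2 apps, 3 users, 2 candidate networks.
potentials = [[[2.0, 1.0], [2.0, 1.0], [0.0, 1.0]],
              [[0.0, 1.0], [0.0, 0.0], [0.0, 0.0]]]
adopted = [[1, 1, 0], [0, 0, 1]]

s_full, a_full, ll_full = fit(potentials, adopted, 3, 2)
s_ind, a_ind, ll_ind = fit(potentials, adopted, 3, 2, use_network=False)
```

Setting `use_network=False` corresponds to configuration c) and `use_individual=False` to configuration d); the clipping at zero enforces the non-negativity constraints of Eq. 6.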

We find the surprising result that app installations are highly predictable with
individual variance and network information, as shown in Table 2. In addition,
Table 2 suggests that the assumptions of our model are supported by the data:
both individual variance and network effects play important roles in the app
installation mechanism, as do the exogenous factors modeled by $G^p$.

We also notice that while accuracy almost doubles, it is often impossible to
observe this improvement using RMSE. Therefore, we will not use RMSE for the rest
of the work.


Figure 2: We demonstrate the prediction performances using each single network
here. For comparison, we also show the result of random guess, and the result
using our approach, which combines all potential evidence.

We now illustrate the prediction performance when our algorithm is only allowed
to use one single network. The results are shown in Fig. 2. We find that except
for the affiliation network, almost all other networks predict well above chance
level. The call log network seems to achieve the best results. We conclude that
while network effects are strong in app installations, a well-crafted model such
as our approach can vastly increase the performance by computing the composite
network and taking other factors into account.

Prediction  Performance 

We now compare the performance of our model with some other implementations for
prediction. As there is no closely related work on app prediction with multiple
networks, we here compare prediction performance with some alternative
approaches we can think of.

Since it is practically difficult to observe every user’s app installation
behavior, in our experiments we also want to test the performance of each
algorithm when the training set is small. In particular, we evaluate the
performance of different implementations with two approaches for cross
validation: 1) Normal-size training set: We randomly choose half of all the apps
in the dataset as the training set, and test on the other half of the dataset.
2) Small-size training set: We randomly choose only 20% of all the app
installations in our dataset as the training set, and test on the remaining 80%.
In both cases, we repeat the process five times for cross validation and take
the average of the results.

For our algorithm, we feed it with networks $G^p$, $G^a$, $G^b$, $G^c$ and $G^f$
obtained by phones and surveys as described previously. For SVM, we apply two
different approaches in predictions:

• We don’t consider the underlying network, but simply use the adoption status
of all other nodes as the features for each node. We test this approach simply
to establish a baseline for prediction. We refer to it as “SVM-raw”.

• We compute the potential $p_a^m(i)$ for each candidate network $G^m$, and we
use all the potentials from all candidate networks as features. Therefore, we
partially borrow some ideas from our own model to implement this SVM approach.
We refer to this approach as “SVM-hybrid”.

We use a modern SVM implementation (Chang and Lin 2001), which is capable of
generating probabilistic predictions rather than binary predictions.

We also replace Eq. 5 with a linear regression model by using the network
potentials $p_a^m(i)$ together with the number of apps per user (instead of
learning $s_u$ in our MLE framework) as independent variables. We call this
approach “Our Approach (Regression)” in the following text to distinguish the
difference. We also enforce the non-negative accumulation assumption in the
regression setting.

Results for both the normal-size training set and the small-size training set
are shown in Table 3, and we discover that our algorithm outperforms other
competing approaches in all categories. However, we notice that when they borrow
many of our model assumptions, generic methods can also achieve reasonably good
results. Performance on the half of the users that are less active in app
installation is also shown. Because this group of users is very inactive, they
may be more susceptible to network influence in app installation behaviors. We
notice that our algorithm performs better in this group, with more than 10%
improvement over other methods.

Predicting  Future  Installations 

In app marketing, one key issue is to predict future app installations.
Predicting future app adoption at time $t$ in our model is equivalent to
predicting installation with part of the neighbor adoption status unknown. These
unknown neighbors who haven’t adopted at time $t$ may or may not adopt at
$t' > t$. Though our algorithm is trained without the information of time of
adoption, we show here that the inferred individual variance $s_u$ and composite
vector $(\alpha_1, \ldots, \alpha_M)$ can be used to predict future app adoption.

Configuration                          RMSE    MP-5    F1 Score
Net. + Ind. Var. + Exogenous Factor    0.25    0.31    0.43
Net. + Ind. Var.                       0.26    0.29    0.42
Ind. Variance Only                     0.29    0.097   0.24
Net. Only (non-negative)               0.26    0.24    0.37
Net. Only (allow negative)             0.30    0.12    0.12

Table 2: The performance of our approach under five different configurations. We observe
that modeling both individual variance and networks is crucial to performance, as is
enforcing non-negative composition for candidate networks as in Eq. 1.

                              20% as Training Set,   50% as Training Set,   50% as Training Set,
                              All Users              All Users              Low Activity Users
Methods                       MP-5     F1 Score      MP-5     F1 Score      MP-5     F1 Score
Our Approach                  0.28     0.46          0.31     0.43          0.20     0.43
SVM-raw                       0.17     0.26          0.24     0.32          0.14     0.27
SVM-hybrid                    0.14     0.29          0.27     0.30          0.16     0.30
Our Approach (Regression)     0.27     0.42          0.30     0.41          0.18     0.39
Random Guess                  0.081    0.17          0.081    0.17          0.076    0.14

Table 3: Prediction performance for our algorithm and competing methods.

We here apply the following cross-validation scheme to test our algorithm’s
ability to predict future installations: For the adopters of each app, we split
them into two equal-size groups by their time of adoption. Those who adopted
earlier are in G1, and those who adopted later are in G2. The training phase is
the same as in the previous section. In the testing phase, each algorithm will
only see adoption information for subjects in G1, and predict node adoption for
the rest. The nodes in G2 will be marked as non-adopters during the prediction
phase.
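The split described above can be sketched as follows. The function names and the timestamp format are hypothetical, and placing the extra adopter in G2 when the count is odd is an assumption, since the text only states that the groups have equal size.

```python
def split_adopters_by_time(adoption_times):
    """Split one app's adopters into early (G1) and late (G2) halves.

    adoption_times: dict mapping user index -> installation timestamp,
    containing adopters only (hypothetical format).
    """
    ordered = sorted(adoption_times, key=adoption_times.get)
    half = len(ordered) // 2  # with an odd count, G2 gets the extra user
    return ordered[:half], ordered[half:]


def visible_adoption(num_users, early_adopters):
    """Adoption vector shown at test time: only G1 is marked as adopted."""
    status = [0] * num_users
    for u in early_adopters:
        status[u] = 1
    return status


# Toy example: five users, four of whom installed the app at some time.
times = {0: 5.0, 3: 1.0, 4: 9.0, 2: 3.0}
g1, g2 = split_adopters_by_time(times)
observed = visible_adoption(5, g1)
# Users whose adoption must be predicted: everyone outside G1.
to_predict = [u for u in range(5) if u not in g1]
```

Precision is then measured on `to_predict`, where only the members of `g2` count as true adopters.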

Results from cross validation are shown in Table 4. We notice that our algorithm
still maintains the best performance, with a limited decrease in accuracy
compared with Table 3. Since the number of adopted nodes is smaller than in
Table 3, we here show MP with smaller $k$ in Table 4.

                      MP-k
Methods               k = 3    k = 4    k = 5    F1 Score
Our Approach          0.18     0.16     0.15     0.35
SVM-hybrid            0.15     0.13     0.12     0.32
Our (Regression)      0.17     0.15     0.14     0.33
Random                0.045    0.045    0.045    0.090

Table 4: MP-k and F1 scores for predicting future app installations.

Notice in Table 4 that the random guess precision is reduced by half. Therefore,
even though the precision here is 30% lower than in Table 3, this is mainly due
to the fact that nodes in G1 are no longer in the predicting set. Our accuracy
is considerable, as it is four times better than random guess.


Predictions  With  Missing  Historical  Data 

In practice, it is sometimes not possible to observe the app installations of
all users due to privacy reasons. Instead, app market makers may only be allowed
to observe and instrument a small subset of a community. We here want to study
whether it is still possible to make some predictions of app installations under
such circumstances.

To formally state this problem, we assume that all the nodes $1, \ldots, U$ are
divided into two groups: the observable group G1 and the unobservable group G2.
During cross validation, only nodes in the observable group are accessible to our
algorithms in the training process, and nodes in the unobservable group are
tested with the prediction algorithms. Therefore, for our algorithm, even though
the individual variance $s_u, u \in$ G1 is computed in the training process, we
will not have $s_{u'}, u' \in$ G2 for Eq. 5 in the testing phase. We illustrate
the prediction precision results in Fig. 3. It seems that even when trained on a
different set of subjects without calibrating user variance, the composite
vector learned by our algorithm can still be applied to another set of users and
achieve results 80% above random guess.

Conclusion 

Our contributions in this paper include: a) We present the data of a novel mobile
phone based experiment on app installation behavior; b) We illustrate that there
are strong network effects in app installation patterns even with tremendous
uncertainty in app installation behavior; c) We show that by combining the
networks measurable with modern smart phones, we can maximize the prediction
accuracy; d) We develop a simple discriminative model which combines individual
variance, multiple networks and exogenous factors, and our model provides
prediction accuracy four times better than random guess in predicting future
installations.

Future work includes the causality problem in studying network phenomena and a
temporal model for app adoption. We believe the former can be addressed with
more carefully crafted lab experiments. For the latter, we have attempted
multiple temporal adoption models but failed. We suspect that the mechanism of
temporal diffusion of apps is very complicated, and we leave this as future work.

Figure 3: The MP from our approach and two comparison approaches (x-axis: Percentage of
All Subjects Used for Training). We here set k for MP to be the average number of users
in G2 for each testing app.

Though our convex optimization framework is fast and reasonably scalable, it
should be noted that the method proposed in this paper may still not be suitable
for handling data from billions of cell phone users. Potential solutions include
dividing users into small clusters and conquering each, and sampling users for
computation. The scalability problem remains future work.